Tongyi DeepResearch AI News List


List of AI News about Tongyi DeepResearch

Time: 2025-10-30 10:00

Alibaba's Tongyi DeepResearch AI Agent Surpasses GPT-4o and DeepSeek-V3 in Deep Research Using Only 3.3B Active Parameters

According to @godofprompt, Alibaba has released Tongyi DeepResearch, a 30B-parameter open-source AI agent that reportedly outperforms GPT-4o and DeepSeek-V3 on deep research tasks while activating only 3.3B parameters (source: https://twitter.com/godofprompt/status/1983836518067401208). Rather than following the industry trend of scaling to 600B+ parameters, Alibaba focuses on the training approach. The model introduces 'agentic mid-training,' an intermediate phase that teaches the model to act as an agent before it learns specific tasks, bridging the gap between language pre-training and task-specific post-training. The post presents this as a way to address the alignment issues seen in traditional supervised fine-tuning and reinforcement learning. All training data is AI-generated, with no human annotation, and includes complex, multi-hop reasoning samples. Reported results are 32.9% on Humanity's Last Exam, 43.4% on BrowseComp, and 75% on xbench-DeepSearch, which the post describes as state of the art. The post also states that training took two days on two H100 GPUs at under $500 per task. If these figures hold, they point to business opportunities for cost-efficient, high-performing AI agents and a shift toward smarter training over brute-force scaling (source: arxiv.org/abs/2510.24701; github.com/Alibaba-NLP/DeepResearch).
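To put the parameter figures in perspective, the short Python sketch below works out what share of the model is active and how that compares with the 600B scale mentioned above. It uses only the numbers quoted in this item. The function names are illustrative, and the comparison model that uses all 600B parameters at once is a hypothetical reference point, not a specific system.

```python
# Figures quoted in the news item (in billions of parameters).
TOTAL_PARAMS_B = 30.0    # reported total size of Tongyi DeepResearch
ACTIVE_PARAMS_B = 3.3    # reported active parameters
# Assumption: a hypothetical model that uses all 600B parameters at once.
BASELINE_PARAMS_B = 600.0


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of the model's parameters that are active."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


def baseline_ratio(baseline_b: float, active_b: float) -> float:
    """Return how many times larger the baseline is than the active set."""
    if active_b <= 0:
        raise ValueError("active_b must be positive")
    return baseline_b / active_b


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    ratio = baseline_ratio(BASELINE_PARAMS_B, ACTIVE_PARAMS_B)
    print(f"Active share of parameters: {share:.0%}")
    print(f"Hypothetical 600B baseline vs. active set: {ratio:.0f}x")
```

Parameter counts are only a rough proxy for compute cost, so the ratio illustrates scale and should not be read as a measured speed or cost difference.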